Surrogate Learning - An Approach for Semi-Supervised Classification
Authors
Abstract
We consider the task of learning a classifier from the feature space X to the set of classes Y = {0, 1}, when the features can be partitioned into class-conditionally independent feature sets X1 and X2. We show the surprising fact that this class-conditional independence can be used to represent the original learning task in terms of 1) learning a classifier from X2 to X1 and 2) learning the class-conditional distribution of the feature set X1. This fact can be exploited for semi-supervised learning because the former task can be accomplished purely from unlabeled samples. We present an experimental evaluation of the idea in two real-world applications.
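The decomposition described in the abstract can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the paper: it assumes a binary surrogate feature x1, a discrete feature x2 with four values, and made-up probability tables. Under class-conditional independence, P(x1=1 | x2) = P(x1=1 | y=0) P(y=0 | x2) + P(x1=1 | y=1) P(y=1 | x2), so the class posterior P(y=1 | x2) can be solved for from a predictor of x1 given x2 (learnable from unlabeled data) and the two numbers P(x1=1 | y), provided they differ.

```python
import numpy as np

# Hypothetical toy distributions (assumptions for illustration, not from the paper).
p_y = np.array([0.7, 0.3])                        # P(y=0), P(y=1)
p_x1_given_y = np.array([0.2, 0.9])               # P(x1=1 | y=0), P(x1=1 | y=1)
p_x2_given_y = np.array([[0.4, 0.3, 0.2, 0.1],    # P(x2 | y=0)
                         [0.1, 0.2, 0.3, 0.4]])   # P(x2 | y=1)

# Ground-truth posterior P(y=1 | x2), which would normally require labels.
joint_y_x2 = p_y[:, None] * p_x2_given_y          # P(y, x2), shape (2, 4)
p_x2 = joint_y_x2.sum(axis=0)                     # P(x2)
true_posterior = joint_y_x2[1] / p_x2

# What unlabeled data reveals: P(x1=1 | x2), obtained by marginalizing y out
# of the joint P(y) P(x1 | y) P(x2 | y) implied by conditional independence.
joint_x1_x2 = (p_x1_given_y[:, None] * joint_y_x2).sum(axis=0)  # P(x1=1, x2)
p_x1_given_x2 = joint_x1_x2 / p_x2


def surrogate_posterior(p_x1_given_x2, p_x1_given_y):
    """Recover P(y=1 | x2) from P(x1=1 | x2) and P(x1=1 | y)."""
    a0, a1 = p_x1_given_y
    if np.isclose(a0, a1):
        raise ValueError("x1 must depend on y: P(x1=1|y=0) != P(x1=1|y=1)")
    return (p_x1_given_x2 - a0) / (a1 - a0)


recovered_posterior = surrogate_posterior(p_x1_given_x2, p_x1_given_y)
print(np.allclose(recovered_posterior, true_posterior))
```

In a real setting the table `p_x1_given_x2` would be replaced by a predictor fitted on unlabeled samples, and `p_x1_given_y` by estimates from the few labeled samples; both substitutions introduce estimation error that this exact-probability sketch does not model.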
Similar resources
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach that uses random forest as the base classifier to classify the low-back disorder (LBD) risk associated with industrial jobs. The semi-supervised approach uses unlabeled data together with a small number of labelled samples to create a better classifier. The results obtained by the proposed approach are compared with those o...
The Pessimistic Limits of Margin-based Losses in Semi-supervised Learning
We show that for linear classifiers defined by convex margin-based surrogate losses that are monotonically decreasing, it is impossible to construct any semi-supervised approach that is able to guarantee an improvement over the supervised classifier measured by this surrogate loss. For non-monotonically decreasing loss functions, we demonstrate that safe improvements are possible.
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...
Delft University of Technology Projected estimators for robust semi-supervised classification
For semi-supervised techniques to be applied safely in practice, we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Unlike other approaches to semi-supervised learning, the procedure proposed in this work does not rely on assumptions that are not intrinsic to the classifier at hand....
On Measuring and Quantifying Performance: Error Rates, Surrogate Loss, and an Example in SSL
In various approaches to learning, notably in domain adaptation, active learning, learning under covariate shift, semi-supervised learning, learning with concept drift, and the like, one often wants to compare a baseline classifier to one or more advanced (or at least different) strategies. In this chapter, we basically argue that if such classifiers, in their respective training phases, optimi...
Journal: CoRR
Volume: abs/0809.4632
Publication date: 2008